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Abstract 

Linear-Nonlinear-Poisson (LNP) models are a popular and powerful tool for describing 
encoding (stimulus-response) transformations by single sensory as well as motor 
neurons. Recently, there has been rising interest in the second- and higher-order 
correlation structure of neural spike trains, and how it may be related to specific 
encoding relationships. The distortion of signal correlations as they are transformed 
through particular LNP models is predictable and in some cases analytically tractable 
and invertible. Here, we propose that LNP encoding models can potentially be identified 
strictly from the correlation transformations they induce, and develop a computational 
method for identifying minimum-phase single-neuron temporal kernels under white 
and colored- random Gaussian excitation. Unlike reverse-correlation or maximum- 
likelihood, correlation-distortion based identification does not require the simultaneous 
observation of stimulus-response pairs - only their respective second order statistics. 
Although in principle filter kernels are not necessarily minimum-phase, and only their 
spectral amplitude can be uniquely determined from output correlations, we show that 
in practice this method provides excellent estimates of kernels from a range of 
parametric models of neural systems. We conclude by discussing how this approach 
could potentially enable neural models to be estimated from a much wider variety of 
experimental conditions and systems, and its limitations. 

Keywords: system identification; correlation function; neural population; receptive 
field; point process; auto-regressive model 
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1 Introduction 

The characterization of neural encoding relationships is a central problem in experimental 
systems neuroscience. While general nonlinear systems characterization in terms of volterra- 
series expansions or detailed functional biophysical models have been applied to this 
problem, in practice, much of neural encoding research has been focused on reduced models 
like the Linear-Nonlinear-Poisson (LNP) cascade or its non-Poisson and mutually exciting 
generalizations like the Generalized Linear Models (Paninski et al. 2007). Encoding models 
are typically estimated using experiments where the neuron's response to a Gaussian stimulus 
process is measured: reverse-correlation is often applied to identify the LNP's linear kernel 
(Ringach and Shapley 2004; Schwartz et al. 2006; Wu et al. 2006), or maximum-likelihood- 
based optimization is used to identify the parameters of the more complex models. 

In recent work (Krumin and Shoham 2009), we have obtained analytical results for the 
correlation structure of (multiple) LNP model outputs when the inputs are white Gaussian 
noise processes and the nonlinearities are either exponential, square or absolute-value. Using 
these results, we were able to generate synthetic spike trains with a fully-controllable 
correlation structure (defined by auto- and cross-correlation functions) by selecting 
appropriate hnear kernels and nonlinearity parameters. This work was primarily motivated by 
our interest in neural pattern photo-stimulation (Shoham et al. 2005a; Farah et al. 2007), 
where synthetic spike trains can be used, for example, to emulate input activity onto a neuron, 
or to control neuron populations in artificial neuroprosthetic interfaces. 

In this paper we extend our previous correlation-distortion results (relevant theory 
reviewed in section 2) and adapt them fi-om the domain of synthetic spike train generation to 
the neural system identification problem. Viewed differently, this general approach can be 
used to determine parameters in a specific LNP encoding model excited by white Gaussian 
noise, given only the correlation structure of observed ou^ut spike trains. Unlike other neural 
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system-identification methods, correlation-distortion based identification is not based on the 
joint statistics of stimulus-response pairs and thus does not require the simultaneous 
observation of stimulus and response. However, the method used in (Krumin and Shoham 
2009) only considered uncorrelated white noise inputs and more-importantly calculates non- 
causal linear kernels which are implausible as neural encoding models. In section 3 we 
address this issue by using auto-regressive modeling of both input and output processes to 
develop a method for identifying plausible, minimum-phase, single-channel linear kernels, 
which we test in section 4. In section 5 we discuss the general features and possible 
applications of correlation-distortion based identification of neural encoding models and the 
fundamental and practical limitations of this approach. In what follows, we focus our 
discussion on the exponential nonlinearity, but the results can be easily extended to the other 
nonlinearities considered in (Krumin and Shoham 2009) and beyond. 

2 Background - correlation distortions in a Linear-Exponential-Poisson model 

In (Krumin and Shoham 2009) we calculated the auto- and cross-correlation functions of 
multiple-input multiple-output Linear-Nonlinear-Poisson systems excited by uncorrelated 
Gaussian noise. The uncorrelated processes are 'colored' by the linear stage and their 
correlation structure is systematically distorted by the static nonlinearity (Johnson 1994) in 
generating the rate processes. Interestingly, the expected auto- and cross-correlation 
functions of the Poisson spike trains have exactly the same structure as for the rates, except 
for the zero lag autocorrelation (proved in (Krumin and Shoham 2009)). 

We will begin by briefly repeating the main derivation of the exponential correlation 

distortions. When transforming the Gaussian variables X. ~ N(0,lj with correlations 
□ E\X. -X.l using the following transformation: 
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A.=exp(//.+a-.^.) 



The resulting variable has a log-normal distribution with expectation: 



(1) 



E [A. ] = exp (//. ) ■ E [exp (o-.X. ) J = exp (//. ) • exp -E {a.X. ) 



= exp 



(2) 



Where we have used Wick's theorem (Simon 1974) for mean zero, normal random 
variables v~A^(0,c7^): 



^[exp(v)] = exp -^[v'] 



(3) 



Using the same theorem, we can also derive expressions for the distorted correlations in 
terms of the correlations of Gaussian random variables: 

R^j=E [a. • A^. ] = ^ [exp (//. ) exp (cr.X. ) • exp (jUj ) exp (cr^JT^. )] = 

= exp (//, +//.)• exp £ \(a,X, + cjjX. )' ll = £ [ A, ] £ [ A . ] • exp (ct,(7 ) 



And for the case i = j( r^i = 1 ): 



i^,n£[A,.^] = £^[Aj.exp(af) 



(5) 



Let us now assume that we observe the mean output rates and correlation structure, and 
want to estimate from them the pre-distorted statistical structure, and the parameters of the 
exponent. Equations (2) and (5) together allow us to evaluate the parameters and cr,. : 



//, = hi(£[Aj)-^ = ln 



(6) 



Finally, from equation (4) we derive the pre-distorted correlations as: 

f ^ \ 



1 



•In 



£[A,]£[a.] 



(7) 
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3 Identifying a single Linear-Exponential-Poisson model 

We consider the problem of identifying an LNP model with rate: 

A(?) = exp(//+cr-x(?)) = exp(//+cr-(/z*5)(?)) (8) 

from the measured auto-correlation fijnction of the output spike train (r) and the auto- 
correlation function i?^ (r) of the Gaussian stimulus process5(^). Here, x(^) is the output 

after applying the linear kernel A (?) to the stimulus {0<t<K-At). 

To solve this identification problem, we begin by using equation (6) to determine the 
parameters of the exponential nonlinearity, using the mean rate estimate: £'[A.] = £'[AA^]/A? 

, and the rate's second moment estimate: R.. = E^AN^~^- E[ANy At^ . Next, we observe that 

equation (7) relates between the correlation functions of x(?) : R^ (r) □ + and 

the rate auto-correlation function 7?^ (r) □ + , which in turn has the same 

expectation as the measurable auto-correlation fijnctions R^[t^ of the doubly stochastic 
Poisson process (for all r^O ). 

3. 1 1dentifying the linear filter 

Given an estimate of the autocorrelation R^ (r) of the linear stage output, the problem of 
identifying the filter h{t^ can be addressed using a number of frequency- or time-domain 
methods (Ljung 1999). The solution to this problem is fimdamentally non-unique - since 
clearly, for example, isometric fransformations like adding a delay to h{t^ or inverting the 

time axis will not alter the auto-correlation function. It is easy to see that in the frequency 
domain we have enough information to determine the correct magnitudes 

|A(/)| = ^F [iJ^C^")] but not the phases, leaving an infinite family of filters possible. In 
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(Krumin and Shoham 2009) we estimated h(^t^ using the matrix square root (Cholesky 
decomposition) of the auto-correlation matrix of that is, a Toeplitz matrix which has 

(A:A2") along the A:-th diagonal. Intuitively, we require that neural filters be causal, but this 

approach generally leads to non-causal filters. Another intuitive notion with regard to neural 
systems is that of stability with regard to perturbations (in general, neural systems can clearly 
be unstable - but those aspects probably cannot be studied using classical neural system 
identification experimental paradigms). Causal all-pole, auto-regressive (AR) filters that are 
also stable are also 'minimum phase', that is, they introduce minimal group delay to their 
input by having most of their response energy concentrated near the start of the impulse 
response. The minimum-phase constraint is thus a more plausible constraint for obtaining 
unique neural-like filters with the correct Irequency transfer function, which can easily be 
estimated using auto-regressive modeling. We consider separately the cases of uncorrelated 
(white) and correlated (colored) stimulus processes (the process is schematically summarized 
in Figure 1). 

The output of a discrete-time causal system driven by white Gaussian noise can be 
modeled using an auto-regressive (AR) model: 

p 

X \n] = -y^^ a^x \n — k] + s \n] (9) 

k=l 

With the following z-domain transfer fijnction: 

/ N ^(Z) 1 1 



The parameters of the model ( ) can be estimated directly from the auto-correlation 
function of the output (r) by solving the Yule-Walker equations (Makhoul 1975): 



7 



■ ^.(0) ^.(1) ... Kip-^) 
^.(1) ^.(0) ... K{p-^) 

When the stimulus s{t^ is correlated, we can consider it as a white Gaussian noise 
process w(?) that is filtered by a linear filter »S'(z) . The stimulus is subsequently filtered by 
the linear filter ^(z), resulting in another correlated Gaussian process x(^) that enters the 
static non-linearity of the LNP model. To estimate the linear filter h{t^ from the auto- 
correlation function of the stimulus (i?^(r)) and the estimated auto-correlation of the hnear 
stage output (i?^(r)), we model both s{t^ and x{t^ as auto-regressive processes with 

transfer functions — and — \—- respectively, whose parameters are estimated 

separately from the auto-correlations. Dividing the latter by the former results in an auto- 
regressive-moving-average (ARM A) model that corresponds to the linear filter A(^) which 

transforms between the two signals, with transfer function H(z) = — . 













MpI 



(11) 
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Figure 1 AR and ARMA modeling of the linear stage, (a) In the case of an uncorrelated stimulus s(t), the 
hnear filter H(z) is approximated using an auto-regressive model 1/A(z), which is estimated from the output 
autocorrelation function Rx(t). (b) In the case of a correlated stimulus s(t), the linear filter H(z) is approximated 
by an ARMA model As(z)/Ax(z), which is estimated from the auto-correlation functions of the stimulus (Rs(t)) 
and the output (Rx(x)) 



While these infinite-impulse-response AR or ARMA models fully capture the properties 
of the linear filter A(^), we also transformed them into an equivalent finite impulse response 

(FIR) moving average (MA) representation, which is the familiar representation encountered 
in the field, by calculating the response to an impulse and truncating after the response 
decays. 



3.2 Smoothing and stabilizing the solutions. 

Efficient estimation of the rate process' auto-correlation function ^^(r) is a crucial step 

in estimating the linear kernel. Spike trains' auto-correlation estimates tend to be very noisy 
and if directly used to estimate the rate autocorrelation lead to imprecise and unstable AR 
models. To stabilize the solution, we used the following steps: 
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1. A smoothing spline was applied with two constraints: a) R^(^0) = R^(^0)-E[ANyAt^ 

and b) i?^ (r = r^) = [ii (AA^)/A?]^ - the auto-correlation function has finite effective 
support and is constant outside. 

2. The square system of Yule- Walker equations does not use the entire support of the auto- 
correlation function for a low order AR model, but for a high-order model it becomes ill- 
posed. We regularize the solution by lowering the order of the estimated auto -regressive 
model while solving an over-determined version of the Yule- Walker equations: 



K{p-2) 



Mp-^) K(p-^) ■■■ K{p-p\ 













Mp). 



P<P (12) 



4 Results 

We first tested these modeling and stabilization steps using simulations of an exponential 
LNP neuron model applied to white (Figure 2) and colored (Figure 3) zero-mean Gaussian- 
distributed pseudo-random stimuli processes. In both tests the sampling rate was 500Hz, the 
linear FIR kernel length was 25 samples, average firing rate was 20Hz, and the rate variance 
was 400 Hz . The record lengths used for the simulations were 1 and 10 hours respectively, 
and the AR/MA models used were of order 15 and 20, respectively. Our results demonstrate 
that in principle, when the neural kernel is exactly minimum-phase, this new approach allows 
accurate reconstruction of the linear kernel from autocorrelations, including the more noisy 
output spiking correlations (where the estimate appears noisier, at higher frequencies). The 
correlation-based estimator is generally noisier than the Spike Triggered Average, in part 
because it uses much less data (no stimulus-spike correlations). In the case of correlated-noise 
input we observed an estimator bias near x=0. 
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Figure 2 Estimation of tlie linear filter kernel: white noise stimulus, (a) Original filter, static non-linearity, 
auto-correlation ftinction normalized by E^[^] and filter fi-equency response, (b) Spike-Triggered-Average 
estimates, (c) Estimation from the (imobservable) auto-correlation function of the output rate process, (d) 
Estimation from the autocorrelation of the output spike train. 
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Figure 3 Estimation of the linear filter kernel: colored noise stimulus, (a) Stimulus auto-correlation passes 
through the linear filter and static non-linearity. Output auto-correlation function is normalized by E^[X]. (b) 
Spike-Triggered-Average estimates, (c) Estimation fi-om the auto -correlation function of the output rate process, 
(d) Estimation trom the autocorrelation of the output spike train (shadowed region represents the standard 
deviation of the estimator). 



Next, we set out to test the method's applicability to estimating neural models. To address 
this issue, we generated a range of filter kernels from four parametric families of models 
previously used in the Computational Neuroscience literature to model temporal filter 
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kernels. Model family #1 was used to model neurons in the retina and visual cortex in studies 
of contrast gain control (Yu et al. 2005; Yu and Sing Lee 2005): 

h{t)^sm(7rat)Gxp[-/^t), t>0 (13) 
Model family #2 was used to capture motion sensitivity in VI neurons (Adelson and 
Bergen 1985; Dayan and Abbot 2001): 



h[t) = exp(c!;^) 



1^ / \t / \k+2 ^ 

[at) [at) 



t > (14) 



k\ (n + 2)! 

Where the parameter k takes values of 3 or 5. 

Model family #3 was used by Dayan and Abbott (2001) as a model of the temporal 
structure of retina and LGN neurons: 

h[t) = a^tQx:p[-at)-phQx:p[-pt), t>0 (15) 
Finally, for completeness, we studied models from a fourth family that generalizes the 
alpha-function type kernels that are often used to model synaptic EPSPs (Dayan and Abbot 
2001): 

h(t) = (aty QyiTp(-at), t>0 (16) 
We tested the method on n = 99 kernels from family 1 (parameter range: 
lO<a,y0<5Osec '), n-l02 kernels from family 2 (parameter range: k = 3,5, 
50 < a < 100 sec"'), n = 99 kernels from family 3 (parameter range: 10<y^<50sec"\ 
10 < or < 250 sec"\ a> J3 ), and n = 105 kernels from family 4 (parameter range: 1 < A: < 5 , 
20 < q; < 200 sec"'). Two representative examples for each family of the original kernel, and 
the kernel estimated from the stochastic firing rate correlations are shown in Figure 4 (as the 
correlation-distortion based method cannot recover shift and scale, the estimated kernel was 
shifted and scaled to best match the original). Overall, the precise temporal structure of model 
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filters was successfiiUy estimated fi-om the firing rate autocorrelation fijnctions with 
correlation coefficients in the range of p > 0.95 (Figure 4). 

Family #1 Family #2 




300 300 




° time[ms] ° time [ms] ^ 

Figure 4 Estimation of the linear filter kernel: examples. Representative filters from four different parametric 
families (discussed in the text, equations (13)-(16)) are successfully estimated form the autocorrelation functions 
of the output and the stimulus. Note that the estimated filters are shifted in time and scaled in order to fit best to 
the original filters (shadowed region represents the standard deviation of the estimator). Insets: histograms of the 
correlation coefficient between the original and the estimated filters, separately to each parametric family. 

To further understand the limitations of estimating neural models using this approach, we 
studied the worst-case estimate in our original data set (family #2, with k = 5, a = 50 sec"'). 
As can be seen in Figure 5a, the original filter is not minimum-phase with some of its zeros 
residing outside the unit circle (Figure 5d). Applying the method to estimate the filter results 
in estimating its minimum-phase version. This minimum-phase filter has a very similar shape 
to the original filter and all its zeros lie inside the unit circle. Such similarity in shape 
between minimum-phase and not minimum-phase filters is typical for the four parametric 
families discussed above. 
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(a) (b) 




time[ms] ^ ° time [ms] 




-300 300 

T[ms] 



Figure 5 Nonminimum-phase kernel estimation, (a) Nonminimum-phase kernel and its minimum-phase 
form. Note that not only the delay is changed, but also the shape of the filter, (b) The estimated kernel fits the 
minimum-phase kernel, (c) The autocorrelation function is the same for all three linear kernels, (d) The zeros 
diagram for the three filters. Note that only for the original kernel part of its zeros lie outside the unit circle 
(dashed line), thus making it nonminimum-phase (one of the original filter zeros lies far on the left and is not 
presented here). 



5 Discussion 

In this paper we propose and develop a new method for indirectly identifying neuron 
encoding models from input-output correlation transformations by adapting our recent 
correlation-distortion results for LNP neuron models (Krumin and Shoham 2009). Our 
discussion focused on LNP encoding models with an exponential nonlinearity. Encoding 
relationships with exponential or exponential-like nonlinearities have been observed in the 
motor system (Paninski et al. 2004; Shoham et al. 2005b) and in various visual-system 
neurons (Gabbiani et al. 2002; Rust et al. 2006; Pillow et al. 2008), and may provide a good 
approximation in additional neuron models. Our development has been focused on a simple 
one-dimensional LNP neuron model and did not get into more complex self- and mutually- 
exciting encoding models (like the GLM). It is also instructive to mention related correlation 
distortion-type formulation by a number of authors exploring correlations induced in binary 
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spike trains generated by directly thresholding Gaussian processes, without going through the 
intermediate step of a rate process (Johnson 1994; Dom and Ringach 2003; Tchumatchenko 
et al. 2008; Macke et al. 2009). Because the second step is deterministic, the structure is 
single-stochastic, in difference with the doubly-stochastic Poisson structure explored here. 
Calculating the Gaussian-process correlations in this framework is somewhat more involved 
(requires numerical solutions of nonlinear equations). Experimental tests of our proposed 
method are currently underway. 

To the best of our knowledge, this is a first example of a method proposed for blindly 
identifying a neuron encoding model, a task which is typically addressed using cross 
correlation-type analysis (Nykamp and Ringach 2002; Ringach 2004). This analysis requires 
precise measurement of each stimulus-response pair, which is challenging, for example, in 
cases where the subject is fi-ee to move and/or monitoring the exact stimulus to which the 
neural system is exposed to is prohibitively difficult. In this case the ability to recover 
encoding models only by independently measuring the input correlation structure and the 
response correlation structure, comprises a significant advantage over existing methods. 
Another possible scenario of interest is attempts to estimate the encoding function to 
internally generated, unobservable, stimuli like inputs irom a lower level neural system, 
volitional movements in a paralyzed individual or dreams. Estimating the input correlations is 
an obvious (but not hopeless) challenge in these scenarios, as is correcting for the inevitable 
deviations fi-om our underljdng assumption of a stationary, Gaussian input process. Previous 
applications of a blind signal processing method in neurophysiological data analysis (namely, 
independent components analysis) addressed problems like the identification of functional 
groups within a large neuron population (Laubach et al. 1999) and spike sorting (Brown et al. 
2001). 
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Fundamental and practical limitations 

The main limitation that emerges in applying the correlation-distortion strategy is the non 
uniqueness of the kernel solutions with respect to changes that do not alter the output 
correlations, such as, for example, adding a time delay. Principally, the identified neural 
models are only unique and correct in terms of their frequency-magnitude response, and there 
is no one way of determining the phases from the correlation distortions. To determine a 
unique temporal filter with plausible causality, stability and time concentration properties 
(not necessarily the actual temporal kernel) we estimated (minimum-phase) AR or ARMA 
models, and then calculated their impulse response. While in general neural models are not 
necessarily minimum-phase (Victor 1989), our results suggest that this approach could under 
certain circumstances be effective at the estimation of a wide range of plausible neural 
models. Alternatively, one can estimate the same non-causal FIR filter kernels by calculating 
the matrix square root of the auto -correlation matrix (Krumin and Shoham 2009) and then 
find a minimum-phase FIR filter with the same magnitude frequency response by reflecting 
all the zeros into the unit circle. In future work, we will explore methods for removing some 
of the general indeterminacy in the estimation of the filter's phase structure using additional 
information from the higher-order correlation structure, multi-neural correlations or other 
sources. 

Finally, we note that this approach is less general (requires more assumptions and 
choices) than the STA and/or ML estimators, including the assumption of a Gaussian input 
disfribution, and having to select a specific non-linearity. In particular, a nonlinearity 
mismatch can result in a biased estimate, however, when we tested for the mismatch 
consequences we have seen fairly minor effects when the data was simulated using a non- 
exponential nonlinearity (Figure 6). Alternatively, such concerns can be addressed by using 
highly fiexible nonlinearities or by comparing several different nonlinearities. 
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r , 50 .. r 1 50 

time [ms] time [ms] 

Figure 6 Nonlinearity mismatch. The original linear kernel (solid line) was estimated (dashed line) assuming 
exponential nonlinearity from the data that was simulated using (a) square or (b) absolute value nonlinearities 
respectively. Original nonlinearities (solid line) and estimated exponential nonlinearity (dashed line) are shown 
in the insets. 
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